Integration of radar systems as the primary sensor with deep learning algorithms
in driver assist systems is still limited. Its implementation would greatly help
in continuous monitoring of visual blind spots for incoming pedestrians.
Hence, this study proposes a single-input single-output Doppler radar
and a long short-term memory (LSTM) neural network for pedestrian
detection. The radar is placed in a monostatic configuration at an angle of 45
degrees from the line of sight. A continuous wave with a frequency of 1.9 GHz is
continuously transmitted from the antenna. The returning signal from the
approaching subjects is characterized by branching peaks higher than the
transmitted frequency. A total of 1108 spectrum traces with Doppler shift
characteristics are acquired from eight volunteers. Another 1108 spectrum
traces without Doppler shifts are used for control purposes. The traces are
then fed to the LSTM neural network for training, validation and testing.
Generally, the proposed method was able to detect pedestrians with 88.9%
accuracy for training and 87.3% accuracy for testing.

Keywords: Doppler, LSTM, Neural network, Pedestrian, Radar

1. INTRODUCTION

Pedestrian detection is among the most vital safety elements in increasingly complex driver
assist systems [1-2]. It ensures that the vehicle detects pedestrians in blind spots and
performs evasive maneuvers when required, whether through warning systems [3] or emergency
braking mechanisms [4]. Thus far, numerous sensing approaches have been tested, including
computer vision [5], laser scanner [6] and automotive radar [7] technologies. Each of these methods has
unique capabilities that enable the vehicle to detect pending collisions with pedestrians [8]. For example,
radar sensors enable the utilization of Doppler and micro-Doppler information obtained from body
movements to identify and discriminate between signals reflected from pedestrians and
other targets [9-10], making them a suitable candidate for this purpose. Hybrid systems have also been proposed to
improve pedestrian detection capabilities [11-12]; however, radar sensors remain an attractive choice for their
ability to obtain unique signatures from reflected signals.

Embedded in these innovative sensing approaches are intelligent algorithms designed to
automatically detect pedestrians and perform evasive maneuvers [13]. Thus far, the two-dimensional
information acquired from imagery has also been tested using advanced artificial intelligence models such as
the convolutional neural network (CNN) [14]. Despite the implementation of LSTM in CNN architectures, these
systems still rely on two-dimensional imagery inputs, which result in high computational requirements [8].
Two major issues have been identified. 1) The use of radar as the primary sensing element is limited since it
relies only on the reflected time-domain signals from targets [15]. Current technology only adopts it as
support to the computer vision and laser scanning systems. 2) With radar as the primary input, an LSTM
neural network is most suited, as the architecture is capable of extracting common features of approaching
pedestrians from the sequential information [16-17]. These, however, remain untested.

To solve the aforementioned problems, the following objectives are outlined. 1) The study
proposes a relatively simple continuous-wave Doppler radar to discriminate between approaching pedestrians
and the controlled condition. 2) The returning signals reflected off the subjects will be used as input to train,
validate and test the LSTM recurrent neural network architecture. This paper is structured as follows.
Section 2 describes the data collection and intelligent classification method used for the study.
Subsequently, Section 3 discusses the spectral trace characteristics and subject detection using the LSTM neural
network. Finally, Section 4 summarizes the contribution of the study and its prospective application for driver
assist technology.


2. METHODOLOGY 
2.1. Experimental setup and acquisition protocol 

Data collection was performed at the Microwave Research Institute, Universiti Teknologi MARA.
The equipment used includes an Agilent MXG Analog Signal Generator, a KeySight FieldFox Microwave
Analyzer, as well as a transmitter and a receiver antenna. As shown in Figure 1, the radar system
is placed in a monostatic configuration. An absorber is positioned between the transmitter and receiver to
minimize spectral leakage. A 10 dBm continuous wave with a frequency of 1.9 GHz from the signal generator
is transmitted by a Vivaldi antenna. The returning waves reflected from the approaching subjects are
captured by the receiver and converted to spectrum traces by the microwave analyzer.
The spectral resolution is set to 500 Hz to allow observable Doppler shift signatures as subjects move closer to
the radar setup.

Figure 1. Experimental setup


Eight volunteers participated in this study. Subjects were required to walk along the specified
path from Point A to Point B at a moderate pace. The system captures the returning signal of approaching
subjects until they pass directly in front of the assumed vehicle's line of sight, thus simulating a situation
that would probably result in a collision. Each subject was required to repeat the trial twenty times, and every
trial produced between six and eight spectrum traces.
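
As a quick consistency check on the protocol, the number of traces it can yield may be bounded with simple arithmetic. The sketch below uses only the figures stated in this paper (eight subjects, twenty trials each, six to eight traces per trial, and 1108 reported Doppler-shift traces); the variable names are illustrative.

```python
# Trace-count bounds implied by the acquisition protocol.
SUBJECTS = 8                 # volunteers
TRIALS_PER_SUBJECT = 20      # repetitions per volunteer
TRACES_PER_TRIAL = (6, 8)    # minimum and maximum traces per trial
REPORTED_TRACES = 1108       # Doppler-shift traces reported in this study

total_trials = SUBJECTS * TRIALS_PER_SUBJECT
min_traces = total_trials * TRACES_PER_TRIAL[0]
max_traces = total_trials * TRACES_PER_TRIAL[1]
mean_traces_per_trial = REPORTED_TRACES / total_trials

print(f"trials: {total_trials}")
print(f"possible traces: {min_traces} to {max_traces}")
print(f"mean traces per trial: {mean_traces_per_trial:.3f}")
```

The reported 1108 traces fall inside the 960 to 1280 range that the protocol allows, which corresponds to an average of just under seven traces per trial.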


2.2. Pedestrian detection using LSTM neural network 

LSTM is an improvement of the recurrent neural network (RNN) used for modelling
sequential data. Figure 2 shows the theoretical architecture of an RNN with the recurrent layer unfolded
into a network [18-19]. U, V and W are the weight parameters of the different network layers, x is the input and h is
the hidden state that grants the network its memory ability. The different time instances are indicated by
t-1, t and t+1. Through the activation function, the output of the hidden layer with present information is
transferred to the hidden layer of the next time instance as part of the input. This feedback preserves
the information of the preceding time instance to retain data dependency, thus improving learning and
abstraction from the sequential data [20-21]. However, the vanishing gradient issue during
back-propagation learning adversely affects the amount of distant memory that can be transferred.
This restricts the capability of the RNN for modelling long-dependency sequential information, making it
unsuitable for this study [22].
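
The unfolded recurrence described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the network used in the study. It assumes the common convention in which U maps the input to the hidden layer, W is the recurrent (hidden-to-hidden) weight and V maps the hidden state to the output, and it assumes a hyperbolic tangent activation. The layer sizes and random weights are arbitrary.

```python
import numpy as np

def rnn_forward(x_seq, U, W, V, h0=None):
    """Unfolded forward pass of a simple RNN.

    x_seq: array of shape (T, n_in), one row per time instance.
    U: input-to-hidden weights, shape (n_hidden, n_in)        (assumed role)
    W: hidden-to-hidden weights, shape (n_hidden, n_hidden)   (assumed role)
    V: hidden-to-output weights, shape (n_out, n_hidden)      (assumed role)
    """
    h = np.zeros(W.shape[0]) if h0 is None else h0
    hidden_states, outputs = [], []
    for x_t in x_seq:
        # The previous hidden state re-enters the layer as part of the input.
        h = np.tanh(U @ x_t + W @ h)
        hidden_states.append(h)
        outputs.append(V @ h)
    return np.array(hidden_states), np.array(outputs)

rng = np.random.default_rng(0)
T, n_in, n_hidden, n_out = 5, 3, 4, 2   # illustrative sizes only
U = rng.normal(scale=0.5, size=(n_hidden, n_in))
W = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
V = rng.normal(scale=0.5, size=(n_out, n_hidden))
x_seq = rng.normal(size=(T, n_in))

h_seq, o_seq = rnn_forward(x_seq, U, W, V)
print(h_seq.shape, o_seq.shape)
```

Because each hidden state depends on the previous one, gradients propagated back through many time instances pass through repeated multiplications by W and the activation derivative, which is where the vanishing gradient issue originates.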


Figure 2. Hidden state of RNN structure [20]


To solve the vanishing gradient issue, the LSTM neural network was proposed. A standard LSTM
block, shown in Figure 3, comprises a memory cell state, forget gate, input gate and output gate.
The memory cell plays a defining role throughout the entire chain by selectively adding or removing relevant
information to or from the cell state through the three-gate system [23].


Figure 3. Hidden state of LSTM structure [20] 

Initially, as shown by (1), the cell state, $C_t$, decides on the information that should be discarded from the previous
cell state, $C_{t-1}$, through the forget gate, $f_t$.

$$ f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{1} $$

Subsequently, as expressed by (2), the input gate, $i_t$, identifies the information from the input, $x_t$, that
should be stored in the cell state, $C_t$. The candidate cell, $\tilde{C}_t$, to be weighted by $i_t$, is then computed through (3).

$$ i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{2} $$

$$ \tilde{C}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{3} $$

Subsequently, as shown by (4), the candidate memory, $\tilde{C}_t$, and the long-term memory from
$C_{t-1}$ are combined to update the cell state, $C_t$.

$$ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4} $$

The output at the present time instant, $h_t$, is then computed by considering both the output gate,
$o_t$, and the cell state, $C_t$. These are mathematically expressed by (5) and (6).

$$ o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{5} $$

$$ h_t = o_t \odot \tanh(C_t) \tag{6} $$

In the aforementioned equations, f, i and o represent the forget gate, input gate and
output gate, respectively. W are the input weights, U are the recurrent weights, and b are the biases for the respective gates
and cell states. $\odot$ denotes element-wise multiplication, $\tanh$ is the hyperbolic tangent and $\sigma$ is the sigmoid function.
Both activation functions are used to introduce non-linearity into the network and are expressed by (7) and (8), respectively.

$$ \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{7} $$

$$ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{8} $$
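
The gate computations in (1) to (6) can be sketched as a single time step in NumPy. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the dictionary layout, the helper names `lstm_step` and `init_params`, the layer sizes and the random weights are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    # Sigmoid activation, as in (8).
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following equations (1) to (6)."""
    W, U, b = params["W"], params["U"], params["b"]
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])     # (1) forget gate
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])     # (2) input gate
    c_hat = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])   # (3) candidate cell
    c_t = f_t * c_prev + i_t * c_hat                           # (4) cell state update
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])     # (5) output gate
    h_t = o_t * np.tanh(c_t)                                   # (6) hidden output
    return h_t, c_t

def init_params(n_in, n_hidden, rng):
    # Hypothetical small random initialisation, for illustration only.
    gates = "fico"
    return {
        "W": {g: rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in gates},
        "U": {g: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in gates},
        "b": {g: np.zeros(n_hidden) for g in gates},
    }

rng = np.random.default_rng(0)
n_in, n_hidden, T = 3, 4, 6              # illustrative sizes only
params = init_params(n_in, n_hidden, rng)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(T, n_in)):
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

The additive form of the cell state update in (4) is what lets information persist across many time steps: when the forget gate stays close to one, the previous cell state is carried forward largely unchanged.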


The LSTM architecture, which incorporates memory cells regulated by the gating mechanism, provides
a solution to the vanishing gradient problem of the RNN. Thus, the improved network structure is capable of
extracting historical information and predicting future trends for long-term dependencies of sequential data. In this
study, the input to the LSTM neural network is the spectral traces obtained from the spectrum analyzer.
The output classes from the hidden states are defined as indexes for the pedestrian and controlled conditions. 70%
of the data is used for training, 15% is used for validation, and the remaining 15% is used for testing [24].
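
The 70/15/15 partition can be illustrated with a short sketch. The paper does not state how traces were assigned to each subset, so the random shuffle, the fixed seed and the function name `split_indices` below are assumptions made only for illustration. The total of 2216 traces is the sum of the 1108 pedestrian and 1108 control traces.

```python
import random

def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and split them into train, validation and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # assumed: random assignment
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]       # remainder goes to testing
    return train, val, test

N_TRACES = 1108 + 1108   # pedestrian traces plus control traces
train_idx, val_idx, test_idx = split_indices(N_TRACES)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 2216 traces this yields 1551 training, 332 validation and 333 testing samples, with no index shared between subsets.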

The performance of the LSTM recurrent neural network for pedestrian detection is assessed in terms of
accuracy (Acc), positive predictivity (Pp), and sensitivity (Se). Acc is described as the ability of the system to
correctly differentiate between approaching subjects and the control condition. Se is defined as the
ability of the system to correctly identify approaching pedestrians. On the other hand, Pp is described as the
probability that, following a positive detection, the subject will be within the line of sight of the vehicle.
These parameters are expressed by (9)-(11), where TP is true positive, TN is true negative, FP is false
positive and FN is false negative classification [25].


$$ Acc = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{9} $$

$$ Se = \frac{TP}{TP + FN} \times 100\% \tag{10} $$

$$ Pp = \frac{TP}{TP + FP} \times 100\% \tag{11} $$
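
The three measures can be computed directly from the confusion counts. The sketch below uses the standard definitions of accuracy, sensitivity and positive predictivity, which are assumed to match (9)-(11). The counts are hypothetical values chosen only to exercise the formulas; they are not results from this study.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (Acc, Se, Pp) in percent from confusion-matrix counts."""
    acc = (tp + tn) / (tp + tn + fp + fn) * 100.0   # accuracy
    se = tp / (tp + fn) * 100.0                     # sensitivity
    pp = tp / (tp + fp) * 100.0                     # positive predictivity
    return acc, se, pp

# Hypothetical counts for illustration only, not measured data.
tp, tn, fp, fn = 90, 80, 20, 10
acc, se, pp = detection_metrics(tp, tn, fp, fn)
print(f"Acc = {acc:.1f}%, Se = {se:.1f}%, Pp = {pp:.1f}%")
```

For these hypothetical counts, Acc is 85.0%, Se is 90.0% and Pp is about 81.8%, which shows how Se and Pp can differ when false positives and false negatives are unbalanced.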


3. RESULTS AND DISCUSSION 
3.1. Spectral profiling using Doppler shift signature

In theory, the signal reflected off subjects moving away from the Doppler radar will exhibit
a longer wavelength than the transmitted signal. This should appear as a secondary peak
with a frequency lower than 1.9 GHz. In contrast, the signal reflected off subjects moving towards the Doppler
radar will exhibit a shorter wavelength than the transmitted signal. This is characterized by the
presence of a secondary peak with a frequency higher than 1.9 GHz. Figure 4 shows a sample of the spectrum
traces obtained from the KeySight FieldFox Microwave Analyzer. The Doppler shift is visible at frequencies
higher than 1.9 GHz. The result is thus valid, as the reflected waves conform to the characteristics of an
approaching pedestrian.
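
The direction of the expected shift can be illustrated with the standard two-way Doppler approximation for a monostatic continuous-wave radar, f_d = 2 v f_0 cos(theta) / c, which holds when the subject speed is much smaller than the speed of light. The sketch below is a worked example under assumptions: the walking speed of 1.4 m/s is an assumed value for a moderate pace, and treating the 45 degree placement angle as the angle between the walking direction and the radar's line of sight is an interpretation, not something the paper states.

```python
import math

C = 299_792_458.0   # speed of light in m/s

def doppler_shift_hz(speed_mps, carrier_hz, angle_deg=0.0):
    """Two-way Doppler shift for a monostatic CW radar.

    speed_mps > 0 means the subject approaches the radar, < 0 means it recedes.
    angle_deg is the angle between the direction of motion and the line of sight.
    """
    radial_speed = speed_mps * math.cos(math.radians(angle_deg))
    return 2.0 * radial_speed * carrier_hz / C

CARRIER_HZ = 1.9e9   # transmitted frequency used in this study
ANGLE_DEG = 45.0     # assumed interpretation of the 45 degree placement
WALK_SPEED = 1.4     # m/s, assumed moderate walking pace

approaching = doppler_shift_hz(+WALK_SPEED, CARRIER_HZ, ANGLE_DEG)
receding = doppler_shift_hz(-WALK_SPEED, CARRIER_HZ, ANGLE_DEG)
print(f"approaching: {approaching:+.1f} Hz, receding: {receding:+.1f} Hz")
```

An approaching subject gives a positive shift (received frequency above the carrier) and a receding subject gives a negative shift of equal magnitude, in line with the description above.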

Figure 4. Spectrum traces for (a) controlled condition, and (b) approaching subject


To further confirm the collective pattern of acquired data, results from each sample are combined to 
form a composite display of spectrum traces. Figure 5 shows the overall spectrum traces for controlled 
condition. The results show a consistent pattern with a dominant peak at frequency of 1.9 GHz. 

Figure 5. Composite spectrum traces for controlled condition (N = 1108 samples)

The results were also compared with the collective pattern of spectrum traces for approaching subjects.
As shown in Figure 6, an increase in spectrum activity is detected at frequencies higher than 1.9 GHz.
This confirms that the Doppler radar is indeed capturing the correct information from the
reflected signals.

Figure 6. Composite spectrum traces for approaching subjects (N = 1108 samples)


3.2. Pedestrian detection using LSTM recurrent neural network 

The spectrum traces, which are treated as sequential information, are subsequently fed as input to the
LSTM neural network. As shown in Table 1, satisfactory results were obtained, with 88.9% Acc for
training, 88.9% Acc for validation, and 87.3% Acc for testing. It is also worth noting that both the Se and Pp
measures range from 76.8% to 99.4% when discriminating between the subject and controlled conditions.
These indicate that the model is capable of detecting approaching subjects by extracting the Doppler shift
information from the respective spectrum traces.


Table 1. Performance of LSTM neural network for pedestrian detection 


Parameters    Pedestrian     Control        Acc
Training      [illegible]    [illegible]    88.9%
Validation    [illegible]    [illegible]    88.9%
Testing       [illegible]    [illegible]    87.3%


4. CONCLUSION 

The study initially set out to 1) implement a Doppler radar as the primary sensing element for detecting
approaching pedestrians, and 2) assess the performance of the LSTM neural network in extracting sequential
information from spectrum traces to distinguish between incoming subjects and the controlled condition.
Through a relatively simple experimental setup, the study was able to produce satisfactory results. First, the
adopted radar system was capable of capturing Doppler shift signatures through the spectrum analyzer.
Second, the LSTM neural network proved capable of extracting the required information for detecting
approaching pedestrians.

While the overall detection accuracy is satisfactory, there is still opportunity for improvement.
Based on observation of the spectrum traces, there are samples in which the Doppler shift is not prominent.
Hence, these appear as outliers within the broad range of samples. Furthermore, the network
had to rely on a relatively small sample size for capturing relevant information. To overcome these problems,
a larger pool of samples is recommended. This is to ensure that the LSTM architecture is capable of
extracting the long-term dependency characteristics of the sequential information and successfully generalizing
the features of incoming pedestrians.